Obtaining better quality final clustering by merging a collection of clusterings
نویسندگان
چکیده
MOTIVATION Clustering methods including k-means, SOM, UPGMA, DAA, CLICK, GENECLUSTER, CAST, DHC, PMETIS and KMETIS have been widely used in biological studies for gene expression, protein localization, sequence recognition and more. All these clustering methods have some benefits and drawbacks. We propose a novel graph-based clustering software called COMUSA for combining the benefits of a collection of clusterings into a final clustering having better overall quality. RESULTS COMUSA implementation is compared with PMETIS, KMETIS and k-means. Experimental results on artificial, real and biological datasets demonstrate the effectiveness of our method. COMUSA produces very good quality clusters in a short amount of time. AVAILABILITY http://www.cs.umb.edu/∼smimarog/comusa CONTACT [email protected]
منابع مشابه
انتخاب اعضای ترکیب در خوشهبندی ترکیبی با استفاده از رأیگیری
Clustering is the process of division of a dataset into subsets that are called clusters, so that objects within a cluster are similar to each other and different from objects of the other clusters. So far, a lot of algorithms in different approaches have been created for the clustering. An effective choice (can combine) two or more of these algorithms for solving the clustering problem. Ensemb...
متن کاملWeighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on different factors such as the number of clusters, the specific clustering algorithm, and the used distance measure. Inspired from ensemble classification, one approach to reduce the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...
متن کاملCombining multiple clusterings using similarity graph
Multiple clusterings are produced for various needs and reasons in both distributed and local environments. Combining multiple clusterings into a final clustering which has better overall quality has gained importance recently. It is also expected that the final clustering is novel, robust, and scalable. In order to solve this challenging problem we introduce a new graph-based method. Our metho...
متن کاملConsensus Based Ensembles of Soft Clusterings
Cluster Ensembles is a framework for combining multiple partitionings obtained from separate clustering runs into a final consensus clustering. This framework has attracted much interest recently because of its numerous practical applications, and a variety of approaches including Graph Partitioning, Maximum Likelihood, Genetic algorithms, and Voting-Merging have been proposed. The vast majorit...
متن کاملClustering Learning Objects Collections Using Cluster Ensembles
Learning Object Repositories are increasingly being used in learning systems to provide high-quality, reusable educational materials. A relevant data mining problem associated with the automatic categorization of learning objects is the discovery of intrinsic classes based on the textual contents of the meta-data records. In this paper, we present a cluster ensemble method, that is applicable f...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Bioinformatics
دوره 26 20 شماره
صفحات -
تاریخ انتشار 2010